-
-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Partial code improvements with respect to Clang 19 warnings #1516
base: main
Are you sure you want to change the base?
Conversation
90d5eeb
to
2b526ba
Compare
032721c
to
161c4ac
Compare
size_t columns = HeaderLayout_getColumns(this->scr->header->headerLayout); | ||
unsigned int columns = HeaderLayout_getColumns(this->scr->header->headerLayout); | ||
MetersPanel** meterPanels = xMallocArray(columns, sizeof(MetersPanel*)); | ||
Settings* settings = this->host->settings; | ||
|
||
for (size_t i = 0; i < columns; i++) { | ||
for (unsigned int i = 0; i < columns; i++) { | ||
char titleBuffer[32]; | ||
xSnprintf(titleBuffer, sizeof(titleBuffer), "Column %zu", i + 1); | ||
xSnprintf(titleBuffer, sizeof(titleBuffer), "Column %u", i + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one is a step backwards. Stick with size_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't understand what you mean. What's the reason of keeping the size_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two-fold:
- It's a number of items (columns)
unsigned int
looks kinda unwieldy IMHO.- For the format string
%zu
is much more portable thanuint32_t
which would require you to use%"PRIu32"
instead of just%u
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BenBE Well, I have a different view on this:
- Object sizes with no likelihood of overflowing an
int
type would not needsize_t
(that is,unsigned long
) for storing. - In x86-64, incrementing and comparing
int
types are slightly cheaper than comparinglong
orsize_t
(in terms of code size). Whenint
types can suffice, usinglong
would make the code one-byte-per-operation larger. - Header column count is unlikely to exceed termical column count (
COLS
), and the latter is bound tosigned int
type already. - And no, we don't need
PRIu32
here. It'sunsigned int
, notuint32_t
.
This case is slightly different from string lengths. My view is string lengths in size_t
can serve as a defence in depth, where the user is possible to feed a 4 GiB large string in a config file. But if the object size would be bound by any other factor, then size_t
might be a waste.
(By the way, I did read this article from EWONTFIX when I write my changes.)
I hope htop developers can reconsider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personal opinion following:
@BenBE Well, I have a different view on this:
- Object sizes with no likelihood of overflowing an
int
type would not needsize_t
(that is,unsigned long
) for storing.
Nonetheless size_t
is less about the number of entities itself, but that you are counting entities.
Otherwise you could argue that "this hashtable implementation only needs short
for its size, because I never store more than 65535 elements in it".
I know, that example is hyperbole, but follows that same logic. When I ask for size_t
it's really little about "how many", but the "what" (AKA semantics).
- In x86-64, incrementing and comparing
int
types are slightly cheaper than comparinglong
orsize_t
(in terms of code size). Whenint
types can suffice, usinglong
would make the code one-byte-per-operation larger.
That might be true for x86_64, but that's not the only platform we need to support. There's at least ARM, ARM64, MIPS, and probably several others, where htop is runing. Unless performance really IS a bottleneck for these functions (AFAICS it's not), priority should be on correctness, not code size or performance.
- Header column count is unlikely to exceed terminal column count (
COLS
), and the latter is bound tosigned int
type already.
See arguments from the article you linked.
That limit is implied by the current interface from ncurses, but a future version could just switch to size_t for its column and row counts (potentially breaking BC at that point); there are good reasons like huge screens where terminals COULD have that many columns/rows.
- And no, we don't need
PRIu32
here. It'sunsigned int
, notuint32_t
.
k, ACK on that. But that's only a minor note regarding the chosen type signature.
This case is slightly different from string lengths. My view is string lengths in
size_t
can serve as a defence in depth, where the user is possible to feed a 4 GiB large string in a config file. But if the object size would be bound by any other factor, thensize_t
might be a waste.
Allocations are typically backed by 4 KiB pages at minimum, with the usual memory manager implementation pre-allocating some memory to handle sudden spikes. Thus unless the savings are major or we have massive amounts of objects (where storing UTF-8 instead of UTF-16 saves several hundreds of megabytes (like with some browsers) there's usually little need to save every last byte. Furthermore, by storing smaller data types you sometimes may make the alignment of the data worse, thus actually reducing performance.
From my experience I can tell, that I've seen real-life cases of where uint16_t
vs. uint32_t
was actually harmful due to alignment and memory access patterns. For 32 bit values on a 64 bit architecture, similar considerations apply.
(By the way, I did read this article from EWONTFIX when I write my changes.)
FWIW, I personally side with the author.
I hope htop developers can reconsider.
@cgzones @fasterit @ginggs @natoscott : Your opinion on this matter of size_t
vs unsigned int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to stick with size_t here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seconded
/DLange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BenBE @natoscott @fasterit
This commit was to fix the implicit downcast warnings from HeaderLayout_getColumns()
by Clang 19 (the messages was: "implicit conversion loses integer precision: 'size_t' (aka 'unsigned long') to 'unsigned int' [-Wshorten-64-to-32]
"). HeaderLayout_getColumns()
was an inline function that returns a size_t
that, in certain caller codes, has to downcast to int
because terminal column variables are in int
type to follow curses
API.
With this background, I would like some clarifications on which decision to make:
(A) Keep HeaderLayout_getColumns()
return value in size_t
. Use size_t
for local variables that hold the return values of HeaderLayout_getColumns()
, and add explicit downcasts when needed (e.g. when used in the context of terminal column widths).
(B) Let HeaderLayout_getColumns()
return an unsigned int
. Keep using int
type for places like terminal columns. But when using in array iterators (like in this example code you guys are commented on), upcast to size_t
.
(C) Let HeaderLayout_getColumns()
return an unsigned int
. Also use unsigned int
for array iterators that would never cross the bound of HeaderLayout_getColumns()
. This is my original proposal.
EDIT: Just a reminder that the (A) option is the least preferred to me as the downcasts can hide potential errors when the original "getColumns" values are out of bounds. Although assertions may be added to ensure the numbers are in bound, they still look like an overkill solution to me (making the problem more complicated than necessary).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A.
Add explicit casts to fix the "-Wshorten-64-to-32" warning (enabled by default in Clang 19). Also shorten the type width for some local variables (where "uint32_t" is enough and no need for "unsigned long"). Signed-off-by: Kang-Che Sung <[email protected]>
Don't assume a meter's caption string length will fit the type of 'int'. Use 'size_t' for the string lengths instead. Even though an overflow on the string length is unlikely, this makes the code more robust. Signed-off-by: Kang-Che Sung <[email protected]>
Change return type of HeaderLayout_getColumns() from 'size_t' to 'unsigned int', and use 'unsigned int' type for all Header column iterators. Using 'size_t' is unnecessary (you cannot have 2^32 columns anyway) and would cause "-Wshorten-64-to-32" warnings (enabled by default in Clang 19).
b23a17d
to
3718c39
Compare
unsigned int minutes = totalSeconds / 60; | ||
unsigned int seconds = totalSeconds % 60; | ||
unsigned int milliseconds = microseconds / 1000; | ||
unsigned int minutes = (unsigned int)totalSeconds / 60; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the cast here happen after the division?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary because of the (totalSeconds < 600)
conditional. Although I argue this is more of a style issue...
(Compiler should catch that (unsigned int)(totalSeconds / 60)
can be optimized to ((unsigned int)totalSeconds) / 60
, no?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, as (unsigned int)((1UL<<32) / 60)
!= (unsigned int)(1UL<<32) / 60
.
Won't happen in this case based on the previous check, but the correctness should take priority here.
@@ -318,25 +325,25 @@ static void LEDMeterMode_draw(Meter* this, int x, int y, int w) { | |||
y + 2; | |||
attrset(CRT_colors[LED_COLOR]); | |||
const char* caption = Meter_getCaption(this); | |||
mvaddstr(yText, x, caption); | |||
int xx = x + strlen(caption); | |||
mvaddnstr(yText, x, caption, w); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is a little beyond my knowledge. According to man pages, the n
in mvaddnstr
refers to number of "characters" to print. However, what I really want to limit is the number of terminal columns, not the number of Unicode characters (this distinction is needed if East Asian Wide characters are in the strings).
Perhaps a good way is to test how this function call would output, would it be "12
" or just "1
":
/* "1234 fullwidth" */
addnstr("\uFF11\uFF12\uFF13\uFF14 fullwidth", 2);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about wcswidth(3)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I wish is a function that does an inverse of wcswidth(3)
. wcswidth()
returns the terminal column width given a string and a number of (Unicode) characters limit. What I wished is retrieving the number of characters (or number of bytes) given a string and a column width limit. RichString_appendnWideColumns()
might give the closest I need, but I want a function that operates on a const char*
rather than a RichString
object. Maybe I would write one from scratch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that is easier said than actual done, but based on the fact that wcswidth
returns >=0 you can probably build something like this by calling wcwidth(3)
for each character in turn until you overshoot the limit, in which case you know the last character was too much (this accounts for zero-width-combiners and other special cases). The down-side is, that you will need to transform the const char*
into wchar_t*
first, thus having this functionality live with the other RichString functions might be reasonable.
NB: There's actually an issue that might benefit from this functionality in #462 IIRC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch contains my proposal of the code quality changes with respect to Clang 19 warnings. (Notably,
-Wshorten-64-to-32
is enabled by default in Clang 19.) I would try to break these into small changes so that these commits may be cherry-picked intomain
early.(This pull request is related to #1513)